Call:
lm(formula = SysBP ~ Infection, data = ICU)
Residuals:
    Min      1Q  Median      3Q     Max
-87.452 -18.672  -2.062  18.548 117.328

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  138.672      2.986  46.440  < 2e-16 ***
Infection    -15.220      4.608  -3.303  0.00113 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 32.16 on 198 degrees of freedom
Multiple R-squared: 0.05223, Adjusted R-squared: 0.04744
F-statistic: 10.91 on 1 and 198 DF, p-value: 0.001134
Write up the interpretation of the model, including:
Intercept value, test statistic, p-value
Intercept interpretation (in words)
Slope value, test statistic, p-value
Slope interpretation (in words)
R^2 value, test statistic, p-value
R^2 interpretation (in words)
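The test statistics requested here can be cross-checked by hand. The following is a minimal sketch that assumes only the rounded values printed in the summary output; the variable names (`est`, `se`, `df_resid`, and so on) are illustrative and not part of the lab. It relies on two standard facts: each t value is the estimate divided by its standard error, and with a single predictor the F statistic equals the squared slope t value.

```r
# Values copied from the printed summary (already rounded there)
est <- c(intercept = 138.672, slope = -15.220)
se  <- c(intercept = 2.986,   slope = 4.608)
df_resid <- 198  # residual degrees of freedom

# t statistic = estimate / standard error
t_val <- est / se

# Two-sided p-values from the t distribution
p_val <- 2 * pt(-abs(t_val), df = df_resid)

# With one predictor: F = t^2 for the slope, and R^2 = F / (F + residual df)
f_val <- unname(t_val["slope"]^2)
r2    <- f_val / (f_val + df_resid)

round(t_val, 2)
signif(p_val, 3)
round(f_val, 2)
round(r2, 4)
```

Because the inputs are rounded, the results should agree with the summary output only up to rounding.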
What is the predicted blood pressure for someone with a suspected infection? What is the predicted blood pressure for someone without a suspected infection?
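When the predictor is coded 0/1, the two predicted values are simply the two group means, and the slope is the difference between them. The sketch below illustrates this on simulated stand-in data, not the ICU data; the data frame `toy` and its columns `group` and `y` are hypothetical names chosen for the example.

```r
# Simulated stand-in data (NOT the ICU data): a 0/1 predictor and an outcome
set.seed(1)
toy <- data.frame(group = rep(c(0, 1), each = 50))
toy$y <- 140 - 15 * toy$group + rnorm(100, sd = 30)

fit <- lm(y ~ group, data = toy)

# Model predictions for each value of the 0/1 predictor
pred <- predict(fit, newdata = data.frame(group = c(0, 1)))

# Observed mean of the outcome in each group
means <- tapply(toy$y, toy$group, mean)

pred   # predicted values for group = 0 and group = 1
means  # the same two numbers, computed directly as group means

# The slope is the difference between the two group means
coef(fit)["group"]
means["1"] - means["0"]
```

Applied to the fitted model in this lab, the same logic means the intercept is the mean blood pressure for patients without a suspected infection, and intercept plus slope is the mean for patients with one.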
Calculate the predicted and residual values for each person and add them to the original dataset. Does it look like the assumptions are satisfied? Make plots to help you decide:
Residual vs predictor
Q-Q plot of residuals
Histogram of residuals (all together and separately for the two groups)
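A residual is the observed value minus the predicted value. With a 0/1 predictor the fitted line passes through both group means, so the residuals average zero in each group and the residual plots mainly speak to equal spread and normality. The sketch below shows the calculation on simulated stand-in data, not the ICU data; `toy`, `group`, `y`, and `resid_by_hand` are hypothetical names.

```r
# Simulated stand-in data (NOT the ICU data), with unequal group sizes
set.seed(2)
toy <- data.frame(group = rep(c(0, 1), times = c(60, 40)))
toy$y <- 140 - 15 * toy$group + rnorm(nrow(toy), sd = 30)

fit <- lm(y ~ group, data = toy)

# Predicted values, and residuals computed by hand as observed - predicted
toy$pred <- fitted(fit)
toy$resid_by_hand <- toy$y - toy$pred

# The hand calculation matches what residuals() returns
all.equal(toy$resid_by_hand, unname(residuals(fit)))

# Residuals average (numerically) zero within each group
tapply(toy$resid_by_hand, toy$group, mean)

# Similar standard deviations across groups are consistent with equal variance
tapply(toy$resid_by_hand, toy$group, sd)
```

Comparing the two group standard deviations is a numeric companion to the faceted histogram; it does not replace looking at the plots.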
Does blood pressure differ between those with a suspected infection and those without? How? (Brief, simple words, no jargon, no statistics.)
1 Plot.
int <- summary(m1)$coefficients[1]
slo <- summary(m1)$coefficients[2]
# annotate() expects a character label, so deparse the expression and let parse = TRUE render it
eq <- deparse(bquote(hat(Y) == .(round(int, 3)) + .(round(slo, 3)) ~ "* Infection"))
ggplot(data = ICU, aes(x = Infection, y = SysBP)) +
  geom_point(alpha = 0.3) +
  annotate("text", x = 0.5, y = 250, label = eq, parse = TRUE, size = 6)
Source Code
---
title: "BTS 510 Lab 6"
format:
  html:
    embed-resources: true
    self-contained-math: true
    html-math-method: katex
    number-sections: true
    toc: true
    code-tools: true
    code-block-bg: true
    code-block-border-left: "#31BAE9"
---

```{r}
#| label: setup
set.seed(12345)
library(tidyverse)
library(Stat2Data)
theme_set(theme_classic(base_size = 16))
```

## Learning objectives

* Describe the logic of **linear regression**
* Describe the **assumptions** of linear regression
* Briefly **present results** of linear regression
* Use a **regression equation** to summarize a model
* Compare and contrast **observed**, **predicted**, and **residual** values

## Data

* `ICU` data from the **Stat2Data** package
  * `ID`: Patient ID code
  * `Survive`: 1 = patient survived to discharge or 0 = patient died
  * `Age`: Age (in years)
  * `AgeGroup`: 1 = young (under 50), 2 = middle (50-69), 3 = old (70+)
  * `Sex`: 1 = female or 0 = male
  * `Infection`: 1 = infection suspected or 0 = no infection
  * `SysBP`: Systolic blood pressure (in mm of Hg)
  * `Pulse`: Heart rate (beats per minute)
  * `Emergency`: 1 = emergency admission or 0 = elective admission

## Tasks

1. **Does blood pressure differ between those with a suspected infection and those without?** *Fit a linear regression to answer this question.*

```{r}
library(Stat2Data)
data(ICU)
# Regress systolic blood pressure on the 0/1 infection indicator
m1 <- lm(data = ICU, SysBP ~ Infection)
summary(m1)
```

2. Write up the **interpretation** of the model, including:
    * Intercept value, test statistic, $p$-value
    * Intercept interpretation (in words)
    * Slope value, test statistic, $p$-value
    * Slope interpretation (in words)
    * $R^2$ value, test statistic, $p$-value
    * $R^2$ interpretation (in words)

3. What is the **predicted blood pressure** for someone *with a suspected infection*? What is the **predicted blood pressure** for someone *without a suspected infection*?

* Predict function

```{r}
# One row per value of the predictor: 0 = no infection, 1 = infection suspected
newdat <- data.frame(Infection = c(0, 1))
predict(m1, newdat)
```

* Use R as a smart calculator

```{r}
# Intercept + slope * Infection, using the stored coefficients
bp_0 <- summary(m1)$coefficients[1] + summary(m1)$coefficients[2] * 0
bp_0
bp_1 <- summary(m1)$coefficients[1] + summary(m1)$coefficients[2] * 1
bp_1
```

* Use R as a dumb calculator

```{r}
# Same calculation with the rounded values typed in by hand
bp_0d <- 138.672 - 15.220 * 0
bp_0d
bp_1d <- 138.672 - 15.220 * 1
bp_1d
```

4. Calculate the **predicted** and **residual** values for each person and add them to the original dataset. **Does it look like the assumptions are satisfied?** Make plots to help you decide:
    * Residual vs predictor
    * Q-Q plot of residuals
    * Histogram of residuals (all together and separately for the two groups)

```{r}
# Add predicted and residual values to the dataset
ICU$pred1 <- fitted(m1)
ICU$resi1 <- residuals(m1)

# Residual vs predictor
ggplot(data = ICU, aes(x = Infection, y = resi1)) +
  geom_point(alpha = 0.3, size = 2) +
  geom_smooth(method = "lm")

# Q-Q plot of residuals
ggplot(data = ICU, aes(sample = resi1)) +
  stat_qq() +
  stat_qq_line()

# Histogram of residuals, all together
ggplot(data = ICU, aes(x = resi1)) +
  geom_histogram()

# Histogram of residuals, separately for the two groups
ggplot(data = ICU, aes(x = resi1)) +
  geom_histogram() +
  facet_grid(cols = vars(Infection))
```

5. **Does blood pressure differ between those with a suspected infection and those without? How?** (Brief, simple words, no jargon, no statistics.)

# Plot.

```{r}
int <- summary(m1)$coefficients[1]
slo <- summary(m1)$coefficients[2]
# annotate() expects a character label, so deparse the expression and let parse = TRUE render it
eq <- deparse(bquote(hat(Y) == .(round(int, 3)) + .(round(slo, 3)) ~ "* Infection"))
ggplot(data = ICU, aes(x = Infection, y = SysBP)) +
  geom_point(alpha = 0.3) +
  annotate("text", x = 0.5, y = 250, label = eq, parse = TRUE, size = 6)
```